Website Privacy Preservation for Query Log Publishing
Abstract
In this paper we study privacy preservation for the publication of search engine query logs. In particular, we introduce a new privacy concern: website privacy (or business privacy). We define the possible adversaries who could be interested in disclosing website information, and the vulnerabilities in the query log from which they could benefit. We also detail anonymization techniques to protect website information and explore the different types of attacks that an adversary could use. We then present a graph-based heuristic to validate the effectiveness of our anonymization method and perform an experimental evaluation of this approach. Our experimental results show that the query log can be appropriately anonymized against a specific attack for website exposure by removing only approximately 9% of the total volume of queries and clicked URLs.
Similar resources
Ppdp-mlt: K−anonymity Privacy Preservation for Publishing Search Engine Logs
In this paper we investigate the problem of protecting privacy for publishing search engine logs. Search engines play a crucial role in the navigation through the vastness of the Web. Privacy-preserving data publishing (PPDP) provides methods and tools for publishing useful information while preserving data privacy. Recently, PPDP has received considerable attention in research communities, and...
Layered Approach for Personalized Search Engine Logs Privacy Preserving
In this paper we examine the problem of defending privacy for publishing search engine logs. Search engines play a vital role in the navigation through the enormity of the Web. Privacy-preserving data publishing (PPDP) provides techniques and tools for publishing helpful information while preserving data privacy. Recently, PPDP has received significant attention in research communities, and sev...
Towards Privacy-Preserving Query Log Publishing
It’s an open secret that search engines collect detailed query logs, and sometimes release these data to third parties. While making this wealth of information available provides enormous opportunities for information retrieval and web mining research, it also raises serious concerns about the privacy of individuals. We strongly believe that this data should be published to allow researchers to...
Publishing L2TAP Logs to Facilitate Transparency and Accountability
We propose publishing L2TAP privacy logs to facilitate privacy auditing tasks that involve multiple auditors, an increasingly common requirement in the context of social computing and big data driven science. Our proposal utilizes two ontologies, L2TAP and SCIP, designed for deployment in a Linked Data environment. L2TAP provides provenance enabled logging of events. SCIP synthesizes contextual...
Privacy Preserving Web Query Log Publishing: A Survey on Anonymization Techniques
Releasing Web query logs, which contain valuable information for research or marketing, can breach the privacy of search engine users. Therefore, rendering query logs so as to limit linking a query to an individual, while preserving the data's usefulness for analysis, is an important research problem. This survey provides an overview and discussion of recent studies in this direction.